Sources with the lowest average word length:

Average word length | # of sentences | Source |
---|---|---|
7.80 | 10 | http://dim-sotira3-amm.schools.ac.cy/programmata/mikroiekpedftes/?events=1=mikroiekpedeftes:-1 |
8.02 | 10 | http://choicefm.com.cy/tag/%ce%bd%ce%ad%ce%bf-%ce%b2%ce%af%ce%bd%cf%84%ce%b5%ce%bf-%ce%ba%ce%bb%ce%b9%cf%80/ |
8.09 | 11 | http://www.ccci.org.cy/events/2013-08/ |
8.21 | 16 | http://madtv.com.cy/index.php?pageid=11592 |
8.39 | 10 | http://www.ermesclub.com.cy/ecc_colredpoi.php?language=_gr=en=en=en=_gr=en=en=_gr=en=_gr=_gr=_gr=_gr=en=_gr=en=en=_gr=en=en=en=_gr |
8.40 | 12 | http://portal.ouc.ac.cy/Applicants.aspx?rid=3a6ea547-1865-42fe-be79-8694c63b05df |
8.46 | 40 | http://www.spectus.com.cy/spirits/liqueurs.html?=find_in_set_any=1[]=%20Mozart=find_in_set_any=1[]=%20Austria |
8.51 | 10 | http://www.digitallife.com.cy/diagonismos-zte-e8q-29857/comment-page-16 |
8.51 | 18 | http://www.newsit.com.cy/default.php?pname=Article=133972=1 |
8.56 | 15 | http://must.com.cy/index.php?pageaction=kat=1=10797=Y |
8.58 | 16 | http://mix.com.cy/gr/messages/electronics/photo_video_optics/equipment_for_rent/ads.html?s=0a |
8.59 | 10 | http://www.digitallife.com.cy/kerdiste-mia-psifiaki-embola-gr-60713/comment-page-1 |
8.59 | 11 | http://www.mlsi.gov.cy/mlsi/sid/sidv2.nsf/All/B1EAA5AA4878769FC2257B1900232DD9 |
8.61 | 10 | http://akinita.lidl.com.cy/cps/rde/xchg/SID-341271B0-4907ABCE/lidl_cy/hs.xsl/5304.htm |
8.64 | 10 | http://www.24hours.com.cy/society/itemlist/tag/%CE%94%CE%B5%CE%BB%CF%84%CE%AF%CE%BF%2024H.html |
8.64 | 13 | http://lekythos.lib.ucy.ac.cy/handle/10797/4517?locale-attribute=fr |
8.71 | 16 | http://www.ermesclub.com.cy/ecc_terms.php?language=_gr=_gr=_gr=_gr=en=_gr=en=_gr=_gr=_gr=_gr=en=_gr=en=_gr=_gr=_gr=_gr |
8.72 | 12 | http://www.panaris.com.cy/index.php?option=com_content=article=66:climateck=38:climateck=2=en=1 |
8.73 | 10 | http://www.spectus.com.cy/spirits/whisky.html?page=shop.browse=flypage.tpl=109=36=find_in_set_any=1[]=%20Bruichlladich=find_in_set_any=1[]=Under%20-%2015%E2%82%AC |
8.73 | 10 | http://metalib.unic.ac.cy:8331/V/CQDPT1KN1YB8VKCUCIXP2HPX9XQTB1MLH6I315YDMJL57BVBV7-09575?func=quick-1-details=000000168=DETAILS=000000010 |
8.76 | 17 | http://www.smartwebsites.com.cy/sitemap.php?lang=en=5=98=1=1=1 |
8.78 | 11 | http://choicefm.com.cy/tag/choice-fm/page/2/ |
8.79 | 17 | http://www.newsit.com.cy/default.php?pname=Article=132476=1 |
8.79 | 37 | http://kariera.lidl.com.cy/cps/rde/xchg/SID-B41DA44D-F19F5516/lidl_cy/hs.xsl/5304.htm |
8.79 | 10 | http://portal.ouc.ac.cy/Applicants.aspx?rid=4aba99dc-48c4-4608-b0e1-e6ac6de17d51 |
8.82 | 13 | http://www.public-cyprus.com.cy/product/aksesoyar/kiniton/thikes/thiki-lg-optimus-3d-puro-lgopt3dsblk-mayro/prod1460043pp/prod2810150pp |
8.83 | 10 | http://www.digitallife.com.cy/zte-s8q-diagonismos-85297/comment-page-18 |
8.83 | 10 | http://www.music.net.cy/easyconsole.cfm/page/read/n_id/5221 |
8.84 | 11 | http://www.balla.com.cy/nea-apo-to-24hcomcy/387917-la-santa-muerte-i-agia-tou-thanatou-pou-anatrepse-tin-katadiki-emporon-narkotikon |
8.84 | 12 | http://www.smartwebsites.com.cy/sitemap.php?action=login=en=5=1=7=1=1=1 |

Sources with the highest average word length:

Average word length | # of sentences | Source |
---|---|---|
18.50 | 12 | http://www.woodcraft.com.cy/results.php?=4=2=2=4=2=4=2=4=4=2=4=4=2=4=4=2=4=4=2=4=1=EN=4 |
16.68 | 12 | http://www.panagrotikos.org.cy/eng/sub_cat/dioikisi/symvoulio.htm |
16.60 | 26 | http://www.cytawholesale.com.cy/wholesale2010/page.php?pageID=23=16=39 |
16.54 | 11 | http://www.stockwatch.com.cy/nqcontent.cfm?a_name=announce_view_ase=117613=gr |
16.44 | 13 | http://library.ucy.ac.cy/el/subject-guides/economics/find-book-econ |
16.36 | 10 | http://www.panaris.com.cy/index.php?option=com_content=article=66:climateck=38:climateck=el=3=0 |
16.24 | 13 | http://www.careerjet.com.cy/w%CE%B1%CE%BD%CE%B1%CE%B6%CE%B7%CF%84%CE%B7%CF%83%CE%B7/%CE%B5%CF%81%CE%B3%CE%B1%CF%83%CE%AF%CE%B1?s=%CF%84%CE%BC%CE%B7%CE%BC%CE%B1%CF%84%CE%BF%CF%82=%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7=10=p |
16.20 | 10 | http://www.scubadiversclub-tey.com.cy/gallery/displayimage.php?album=topn=5=66 |
16.05 | 10 | http://www2.parliament.cy/parliamentgr/003_02_biography/koukouma_koutra_skevi.htm |
15.97 | 10 | http://www.orangeshop.com.cy/gr/uel-13-14.html?___from_store=gr=60=8 |
15.82 | 10 | http://lcweb.ucy.ac.cy/flit/photd/flit_%20(137).html |
15.82 | 74 | http://www.eng.ucy.ac.cy/ceepostgradstudies/index_files/Page1302.htm |
15.82 | 15 | http://www.rooms.com.cy/index.php?pageid=166=5=ru |
15.71 | 10 | http://www.thalassamuseum.org.cy/el/arxeiodrastiriotitwn/117-epanasxediasmos-kypriakis-toyristikis-filoxenias.html |
15.69 | 16 | http://isostore.cys.org.cy/cys/home/catalogue_ics/catalogue_detail_ics.htm?ics1=65=060=10=43555 |
15.68 | 14 | http://www.ppu.org.cy/index.php/en/studies/media/21-heresies/occultism/342-occult.html?tmpl=component=1=default= |
15.63 | 15 | http://www.cplaw.com.cy/contacts |
15.59 | 21 | http://shop.superpets.com.cy/CAGE-AZURA.2403 |
15.57 | 20 | http://ucyweb.ucy.ac.cy/ddo/el/staff |
15.50 | 15 | http://ucy.ac.cy/research/el/research-centres-laboratories/research-centres-and-research-units |
15.48 | 10 | http://cypruslibrary.moec.gov.cy/ebooks/The_Cyprus_Gazette_1986_a/files/assets/basic-html/page520.html |
15.46 | 12 | http://www.moa.gov.cy/moa/agriculture.nsf/All/FC0895BACFB9122AC22579C2003F3B2B?OpenDocument |
15.45 | 10 | http://www.careerjet.com.cy/w%CE%B1%CE%BD%CE%B1%CE%B6%CE%B7%CF%84%CE%B7%CF%83%CE%B7/%CE%B5%CF%81%CE%B3%CE%B1%CF%83%CE%AF%CE%B1?s=%CE%BA%CE%B1%CF%84%CE%B1%CF%83%CF%84%CE%B7%CE%BC%CE%B1=%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7=10=f=21 |
15.40 | 15 | http://www.cps-iodpc.com.cy/index.php/component/photocompetition/image/7117-frosty-and-foggy-morning |
15.37 | 19 | http://www2.ucy.ac.cy/admin/nomoi/volumeb/5.0.html |
15.34 | 11 | http://www.careerjet.com.cy/w%CE%B1%CE%BD%CE%B1%CE%B6%CE%B7%CF%84%CE%B7%CF%83%CE%B7/%CE%B5%CF%81%CE%B3%CE%B1%CF%83%CE%AF%CE%B1?s=%CF%80%CF%89%CE%BB%CE%B7%CF%84%CF%81%CE%B9%CE%B1=%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7=10 |
15.33 | 27 | http://soundtech.com.cy/htc/1286-htc-t9292-hd7-black.html |
15.31 | 10 | http://www.lako.com.cy/page.php?id=11=EN=EN=11=EN=EN=11=EN=11=EN=11=EN=11=EN=11=EN=11=EN=11=EN=EN=11=EN=2=EN=11 |
15.29 | 22 | http://www.spectus.com.cy/wine/white.html?=find_in_set_any=1[]=%20%20Alsace[]=%20USA |
15.26 | 12 | http://www.careerjet.com.cy/w%CE%B1%CE%BD%CE%B1%CE%B6%CE%B7%CF%84%CE%B7%CF%83%CE%B7/%CE%B5%CF%81%CE%B3%CE%B1%CF%83%CE%AF%CE%B1?s=%CE%B5%CE%BA%CF%80%CE%B1%CE%B9%CE%B4%CE%B5%CF%85%CF%83%CE%B7%CF%82=%CE%95%CF%85%CF%81%CF%8E%CF%80%CE%B7=10 |
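
This subsection addresses the same question as 6.4.1.1 and yields similar results, but it looks at average word length instead of average sentence length.

The measured average word length depends strongly on tokenization. A typical tokenizer might split the string “28.06.2005” into the five tokens “28 . 06 . 2005”, which have an average length of two characters. To avoid this, the number of words in a sentence is counted as 1 + (number of blanks in the sentence). Note that the query below divides the full sentence length, blanks included, by this word count, so each reported value is higher than the mean length of the words alone by just under one character.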
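
The difference between the two counting schemes can be illustrated on the date example. The sketch below is only an illustration: `punctuation_split_tokens` is a hypothetical stand-in for a "usual" tokenizer (the tokenizer actually used for the corpus is not specified here), and `blank_based_word_count` implements the 1 + (number of blanks) rule.

```python
import re


def punctuation_split_tokens(text):
    """Hypothetical tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def blank_based_word_count(sentence):
    """Word count as used in this subsection: 1 + number of blanks."""
    return 1 + sentence.count(" ")


example = "28.06.2005"

# Tokenizer view: the date falls apart into five short tokens.
tokens = punctuation_split_tokens(example)
avg_token_length = sum(len(t) for t in tokens) / len(tokens)

# Blank-based view: the date is a single word of ten characters.
words = blank_based_word_count(example)
avg_blank_based_length = len(example) / words

print(tokens, avg_token_length)
print(words, avg_blank_based_length)
```

With the blank-based count, dates, URLs, and similar strings are no longer broken into many short pieces, at the cost of punctuation attached to a word being counted as part of that word.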
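```sql
-- Average word length per source, restricted to sources with at least 10 sentences.
-- Words per sentence = 1 + number of blanks.
select round(avg(length(sentence)
                 / (1 + length(sentence) - length(replace(sentence, " ", "")))), 2) as le,
       count(sentence) as cnt,
       source
from sentences s, inv_so i, sources so
where s.s_id = i.s_id
  and i.so_id = so.so_id
group by source
having cnt >= 10
order by le  -- ascending (lowest values); the highest values presumably use "order by le desc"
limit 30;
```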
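
For readers who want to reproduce the measure outside the database, the following is a minimal sketch of the same aggregation in Python. It assumes the input is an iterable of `(sentence, source)` pairs, which is a hypothetical format and not the corpus schema. The `count_bytes` switch is also an assumption: if the query runs on MySQL, `length()` returns the number of bytes, so non-ASCII text such as Greek in UTF-8 yields larger values than a character count would.

```python
from collections import defaultdict


def sentence_ratio(sentence, count_bytes=False):
    """Sentence length divided by (1 + number of blanks), as in the query."""
    size = len(sentence.encode("utf-8")) if count_bytes else len(sentence)
    return size / (1 + sentence.count(" "))


def average_word_length_by_source(pairs, min_sentences=10, limit=30,
                                  descending=False, count_bytes=False):
    """Return rows (average, sentence count, source), sorted by average."""
    ratios = defaultdict(list)
    for sentence, source in pairs:
        ratios[source].append(sentence_ratio(sentence, count_bytes))

    rows = [
        (round(sum(values) / len(values), 2), len(values), source)
        for source, values in ratios.items()
        if len(values) >= min_sentences  # mirrors "having cnt >= 10"
    ]
    rows.sort(key=lambda row: row[0], reverse=descending)
    return rows[:limit]


# Toy data (invented for illustration only).
pairs = (
    [("ab cd", "source-a")] * 10
    + [("abcd efgh", "source-b")] * 10
    + [("x", "source-c")] * 9  # dropped: fewer than 10 sentences
)
lowest = average_word_length_by_source(pairs)
highest = average_word_length_by_source(pairs, descending=True)
```

As in the query, the per-sentence ratios are averaged per source, so every sentence carries the same weight regardless of its length.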
6.4.2.2 Average logarithmic word rank for different sources
6.4.2.3 Sources consisting of many / few words with frequency 1
6.4.2.4 Sources with low / high average word length of rare words